SEQUENCE LISTING 



<110> Koffas, Mattheos 
Odom, J. Martin 
Schenzle, Andreas J. 
Norton, Kelley C. 
Tomb, Jean-Francois
Rouviere, Pierre 
Picataggio, Stephen 
Cheng, Qiong 



<120> Genes Involved in Isoprenoid Compounds Production

<130> CL1646 US NA 

<140> 
<141> 

<150> 60/229,907 

<151> September 1, 2001 

<160> 24 

<170> Microsoft Office 97 

<210> 1 
<211> 1860 
<212> DNA 

<213> Methylomonas 16a 
<220> 

<223> ORF1 



<400> 1 

atgaaactga ccaccgacta tcccttgctt aaaaacatcc acacgccggc ggacatacgc 60
gcgctgtcca aggaccagct ccagcaactg gctgacgagg tgcgcggcta tctgacccac 120
acggtcagca tttccggcgg ccattttgcg gccggcctcg gcaccgtgga actgaccgtg 180
gccttgcatt atgtgttcaa tacccccgtc gatcagttgg tctgggacgt gggccatcag 240
gcctatccgc acaagattct gaccggtcgc aaggagcgca tgccgaccat tcgcaccctg 300
ggcggggtgt cagcctttcc ggcgcgggac gagagcgaat acgatgcctt cggcgtcggc 360
cattccagca cctcgatcag cgcggcactg ggcatggcca ttgcgtcgca gctgcgcggc 420
gaagacaaga agatggtagc catcatcggc gacggttcca tcaccggcgg catggcctat 480
gaggcgatga atcatgccgg cgatgtgaat gccaacctgc tggtgatctt gaacgacaac 540
gatatgtcga tctcgccgcc ggtcggggcg atgaacaatt atctgaccaa ggtgttgtcg 600
agcaagtttt attcgtcggt gcgggaagag agcaagaaag ctctggccaa gatgccgtcg 660
gtgtgggaac tggcgcgcaa gaccgaggaa cacgtgaagg gcatgatcgt gcccggtacc 720
ttgttcgagg aattgggctt caattatttc ggcccgatcg acggccatga tgtcgagatg 780
ctggtgtcga ccctggaaaa tctgaaggat ttgaccgggc cggtattcct gcatgtggtg 840
accaagaagg gcaaaggcta tgcgccagcc gagaaagacc cgttggccta ccatggcgtg 900
ccggctttcg atccgaccaa ggatttcctg cccaaggcgg cgccgtcgcc gcatccgacc 960
tataccgagg tgttcggccg ctggctgtgc gacatggcgg ctcaagacga gcgcttgctg 1020
ggcatcacgc cggcgatgcg cgaaggctct ggtttggtgg aattctcaca gaaatttccg 1080
aatcgctatt tcgatgtcgc catcgccgag cagcatgcgg tgaccttggc cgccggccag 1140
gcctgccagg gcgccaagcc ggtggtggcg atttattcca ccttcctgca acgcggttac 1200
gatcagttga tccacgacgt ggccttgcag aacttagata tgctctttgc actggatcgt 1260
gccggcttgg tcggcccgga tggaccgacc catgctggcg cctttgatta cagctacatg 1320
cgctgtattc cgaacatgct gatcatggct ccagccgacg agaacgagtg caggcagatg 1380
ctgaccaccg gcttccaaca ccatggcccg gcttcggtgc gctatccgcg cggcaaaggg 1440
cccggggcgg caatcgatcc gaccctgacc gcgctggaga tcggcaaggc cgaagtcaga 1500
caccacggca gccgcatcgc cattctggcc tggggcagca tggtcacgcc tgccgtcgaa 1560
gccggcaagc agctgggcgc gacggtggtg aacatgcgtt tcgtcaagcc gttcgatcaa 1620
gccttggtgc tggaattggc caggacgcac gatgtgttcg tcaccgtcga ggaaaacgtc 1680
atcgccggcg gcgctggcag tgcgatcaac accttcctgc aggcgcagaa ggtgctgatg 1740
ccggtctgca acatcggcct gcccgaccgc ttcgtcgagc aaggtagtcg cgaggaattg 1800
ctcagcctgg tcggcctcga cagcaagggc atcctcgcca ccatcgaaca gttttgcgct 1860



<210> 2 
<211> 620 
<212> PRT 

<213> Methylomonas 16a 
<220> 

<223> Amino acid sequences encoded by ORF1 
<400> 2 

Met Lys Leu Thr Thr Asp Tyr Pro Leu Leu Lys Asn Ile His Thr Pro
1 5 10 15

Ala Asp Ile Arg Ala Leu Ser Lys Asp Gln Leu Gln Gln Leu Ala Asp
20 25 30

Glu Val Arg Gly Tyr Leu Thr His Thr Val Ser Ile Ser Gly Gly His
35 40 45

Phe Ala Ala Gly Leu Gly Thr Val Glu Leu Thr Val Ala Leu His Tyr
50 55 60

Val Phe Asn Thr Pro Val Asp Gln Leu Val Trp Asp Val Gly His Gln
65 70 75 80

Ala Tyr Pro His Lys Ile Leu Thr Gly Arg Lys Glu Arg Met Pro Thr
85 90 95

Ile Arg Thr Leu Gly Gly Val Ser Ala Phe Pro Ala Arg Asp Glu Ser
100 105 110

Glu Tyr Asp Ala Phe Gly Val Gly His Ser Ser Thr Ser Ile Ser Ala
115 120 125

Ala Leu Gly Met Ala Ile Ala Ser Gln Leu Arg Gly Glu Asp Lys Lys
130 135 140

Met Val Ala Ile Ile Gly Asp Gly Ser Ile Thr Gly Gly Met Ala Tyr
145 150 155 160

Glu Ala Met Asn His Ala Gly Asp Val Asn Ala Asn Leu Leu Val Ile
165 170 175

Leu Asn Asp Asn Asp Met Ser Ile Ser Pro Pro Val Gly Ala Met Asn
180 185 190

Asn Tyr Leu Thr Lys Val Leu Ser Ser Lys Phe Tyr Ser Ser Val Arg
195 200 205

Glu Glu Ser Lys Lys Ala Leu Ala Lys Met Pro Ser Val Trp Glu Leu
210 215 220

Ala Arg Lys Thr Glu Glu His Val Lys Gly Met Ile Val Pro Gly Thr
225 230 235 240

Leu Phe Glu Glu Leu Gly Phe Asn Tyr Phe Gly Pro Ile Asp Gly His
245 250 255

Asp Val Glu Met Leu Val Ser Thr Leu Glu Asn Leu Lys Asp Leu Thr
260 265 270

Gly Pro Val Phe Leu His Val Val Thr Lys Lys Gly Lys Gly Tyr Ala
275 280 285

Pro Ala Glu Lys Asp Pro Leu Ala Tyr His Gly Val Pro Ala Phe Asp
290 295 300

Pro Thr Lys Asp Phe Leu Pro Lys Ala Ala Pro Ser Pro His Pro Thr
305 310 315 320

Tyr Thr Glu Val Phe Gly Arg Trp Leu Cys Asp Met Ala Ala Gln Asp
325 330 335

Glu Arg Leu Leu Gly Ile Thr Pro Ala Met Arg Glu Gly Ser Gly Leu
340 345 350

Val Glu Phe Ser Gln Lys Phe Pro Asn Arg Tyr Phe Asp Val Ala Ile
355 360 365

Ala Glu Gln His Ala Val Thr Leu Ala Ala Gly Gln Ala Cys Gln Gly
370 375 380

Ala Lys Pro Val Val Ala Ile Tyr Ser Thr Phe Leu Gln Arg Gly Tyr
385 390 395 400

Asp Gln Leu Ile His Asp Val Ala Leu Gln Asn Leu Asp Met Leu Phe
405 410 415

Ala Leu Asp Arg Ala Gly Leu Val Gly Pro Asp Gly Pro Thr His Ala
420 425 430

Gly Ala Phe Asp Tyr Ser Tyr Met Arg Cys Ile Pro Asn Met Leu Ile
435 440 445

Met Ala Pro Ala Asp Glu Asn Glu Cys Arg Gln Met Leu Thr Thr Gly
450 455 460

Phe Gln His His Gly Pro Ala Ser Val Arg Tyr Pro Arg Gly Lys Gly
465 470 475 480

Pro Gly Ala Ala Ile Asp Pro Thr Leu Thr Ala Leu Glu Ile Gly Lys
485 490 495

Ala Glu Val Arg His His Gly Ser Arg Ile Ala Ile Leu Ala Trp Gly
500 505 510

Ser Met Val Thr Pro Ala Val Glu Ala Gly Lys Gln Leu Gly Ala Thr
515 520 525

Val Val Asn Met Arg Phe Val Lys Pro Phe Asp Gln Ala Leu Val Leu
530 535 540

Glu Leu Ala Arg Thr His Asp Val Phe Val Thr Val Glu Glu Asn Val
545 550 555 560

Ile Ala Gly Gly Ala Gly Ser Ala Ile Asn Thr Phe Leu Gln Ala Gln
565 570 575

Lys Val Leu Met Pro Val Cys Asn Ile Gly Leu Pro Asp Arg Phe Val
580 585 590

Glu Gln Gly Ser Arg Glu Glu Leu Leu Ser Leu Val Gly Leu Asp Ser
595 600 605

Lys Gly Ile Leu Ala Thr Ile Glu Gln Phe Cys Ala
610 615 620



<210> 3 
<211> 1182 
<212> DNA 

<213> Methylomonas 16a 
<220> 

<223> ORF2 
<400> 3 

atgaaaggta tttgcatatt gggcgctacc ggttcgatcg gtgtcagcac gctggatgtc 60 

gttgccaggc atccggataa atatcaagtc gttgcgctga ccgccaacgg caatatcgac 120 

gcattgtatg aacaatgcct ggcccaccat ccggagtatg cggtggtggt catggaaagc 180 

aaggtagcag agttcaaaca gcgcattgcc gcttcgccgg tagcggatat caaggtcttg 240 

tcgggtagcg aggccttgca acaggtggcc acgctggaaa acgtcgatac ggtgatggcg 300 

gctatcgtcg gcgcggccgg attgttgccg accttggccg cggccaaggc cggcaaaacc 360 

gtgctgttgg ccaacaagga agccttggtg atgtcgggac aaatcttcat gcaggccgtc 420

agcgattccg gcgctgtgtt gctgccgata gacagcgagc acaacgccat ctttcagtgc 480 

atgccggcgg gttatacgcc aggccataca gccaaacagg cgcgccgcat tttattgacc 540

gcttccggtg gcccatttcg acggacgccg atagaaacgt tgtccagcgt cacgccggat 600 

caggccgttg cccatcctaa atgggacatg gggcgcaaga tttcggtcga ttccgccacc 660 

atgatgaaca aaggtctcga actgatcgaa gcctgcttgt tgttcaacat ggagcccgac 720 

cagattgaag tcgtcattca tccgcagagc atcattcatt cgatggtgga ctatgtcgat 780 

ggttcggttt tggcgcagat gggtaatccc gacatgcgca cgccgatagc gcacgcgatg 840 

gcctggccgg aacgctttga ctctggtgtg gcgccgctgg atattttcga agtagggcac 900 

atggatttcg aaaaacccga cttgaaacgg tttccttgtc tgagattggc ttatgaagcc 960 

atcaagtctg gtggaattat gccaacggta ttgaacgcag ccaatgaaat tgctgtcgaa 1020 

gcgtttttaa atgaagaagt caaattcact gacatcgcgg tcatcatcga gcgcagcatg 1080 

gcccagttta aaccggacga tgccggcagc ctcgaattgg ttttgcaggc cgatcaagat 1140 

gcgcgcgagg tggctagaga catcatcaag accttggtag ct 1182 



<210> 4 
<211> 394 
<212> PRT 

<213> Methylomonas 16a 
<220> 

<223> Amino acid sequences encoded by ORF2
<400> 4 

Met Lys Gly Ile Cys Ile Leu Gly Ala Thr Gly Ser Ile Gly Val Ser
1 5 10 15

Thr Leu Asp Val Val Ala Arg His Pro Asp Lys Tyr Gln Val Val Ala
20 25 30

Leu Thr Ala Asn Gly Asn Ile Asp Ala Leu Tyr Glu Gln Cys Leu Ala
35 40 45

His His Pro Glu Tyr Ala Val Val Val Met Glu Ser Lys Val Ala Glu
50 55 60

Phe Lys Gln Arg Ile Ala Ala Ser Pro Val Ala Asp Ile Lys Val Leu
65 70 75 80

Ser Gly Ser Glu Ala Leu Gln Gln Val Ala Thr Leu Glu Asn Val Asp
85 90 95

Thr Val Met Ala Ala Ile Val Gly Ala Ala Gly Leu Leu Pro Thr Leu
100 105 110

Ala Ala Ala Lys Ala Gly Lys Thr Val Leu Leu Ala Asn Lys Glu Ala
115 120 125

Leu Val Met Ser Gly Gln Ile Phe Met Gln Ala Val Ser Asp Ser Gly
130 135 140

Ala Val Leu Leu Pro Ile Asp Ser Glu His Asn Ala Ile Phe Gln Cys
145 150 155 160

Met Pro Ala Gly Tyr Thr Pro Gly His Thr Ala Lys Gln Ala Arg Arg
165 170 175

Ile Leu Leu Thr Ala Ser Gly Gly Pro Phe Arg Arg Thr Pro Ile Glu
180 185 190

Thr Leu Ser Ser Val Thr Pro Asp Gln Ala Val Ala His Pro Lys Trp
195 200 205

Asp Met Gly Arg Lys Ile Ser Val Asp Ser Ala Thr Met Met Asn Lys
210 215 220

Gly Leu Glu Leu Ile Glu Ala Cys Leu Leu Phe Asn Met Glu Pro Asp
225 230 235 240

Gln Ile Glu Val Val Ile His Pro Gln Ser Ile Ile His Ser Met Val
245 250 255

Asp Tyr Val Asp Gly Ser Val Leu Ala Gln Met Gly Asn Pro Asp Met
260 265 270

Arg Thr Pro Ile Ala His Ala Met Ala Trp Pro Glu Arg Phe Asp Ser
275 280 285

Gly Val Ala Pro Leu Asp Ile Phe Glu Val Gly His Met Asp Phe Glu
290 295 300

Lys Pro Asp Leu Lys Arg Phe Pro Cys Leu Arg Leu Ala Tyr Glu Ala
305 310 315 320

Ile Lys Ser Gly Gly Ile Met Pro Thr Val Leu Asn Ala Ala Asn Glu
325 330 335

Ile Ala Val Glu Ala Phe Leu Asn Glu Glu Val Lys Phe Thr Asp Ile
340 345 350

Ala Val Ile Ile Glu Arg Ser Met Ala Gln Phe Lys Pro Asp Asp Ala
355 360 365

Gly Ser Leu Glu Leu Val Leu Gln Ala Asp Gln Asp Ala Arg Glu Val
370 375 380

Ala Arg Asp Ile Ile Lys Thr Leu Val Ala
385 390



<210> 5 
<211> 693 
<212> DNA 

<213> Methylomonas 16a 
<220> 

<223> ORF3 
<400> 5 

atgaacccaa ccatccaatg ctgggccgtc gtgcccgcag ccggcgtcgg caaacgcatg 60
caagccgatc gccccaaaca atatttaccg cttgccggta aaacggtcat cgaacacaca 120
ctgactcgac tacttgagtc cgacgccttc caaaaagttg cggtggcgat ttccgtcgaa 180
gacccttatt ggcctgaact gtccatagcc aaacaccccg acatcatcac cgcgcctggc 240
ggcaaggaac gcgccgactc ggtgctgtct gcactgaagg ctttagaaga tatagccagc 300
gaaaatgatt gggtgctggt acacgacgcc gcccgcccct gcttgacggg cagcgacatc 360
caccttcaaa tcgatacctt aaaaaatgac ccggtcggcg gcatcctggc cttgagttcg 420
cacgacacat tgaaacacgt ggatggtgac acgatcaccg caaccataga cagaaagcac 480
gtctggcgcg ccttgacgcc gcaaatgttc aaatacggca tgttgcgcga cgcgttgcaa 540
cgaaccgaag gcaatccggc cgtcaccgac gaagccagtg cgctggaact tttgggccat 600
aaacccaaaa tcgtggaagg ccgcccggac aacatcaaaa tcacccgccc ggaagatttg 660
gccctggcac aattttatat ggagcaacaa gca 693



<210> 6 
<211> 231 
<212> PRT 

<213> Methylomonas 16a 
<220> 

<223> Amino acid sequences encoded by ORF3 
<400> 6 

Met Asn Pro Thr Ile Gln Cys Trp Ala Val Val Pro Ala Ala Gly Val
1 5 10 15

Gly Lys Arg Met Gln Ala Asp Arg Pro Lys Gln Tyr Leu Pro Leu Ala
20 25 30

Gly Lys Thr Val Ile Glu His Thr Leu Thr Arg Leu Leu Glu Ser Asp
35 40 45

Ala Phe Gln Lys Val Ala Val Ala Ile Ser Val Glu Asp Pro Tyr Trp
50 55 60

Pro Glu Leu Ser Ile Ala Lys His Pro Asp Ile Ile Thr Ala Pro Gly
65 70 75 80

Gly Lys Glu Arg Ala Asp Ser Val Leu Ser Ala Leu Lys Ala Leu Glu
85 90 95

Asp Ile Ala Ser Glu Asn Asp Trp Val Leu Val His Asp Ala Ala Arg
100 105 110

Pro Cys Leu Thr Gly Ser Asp Ile His Leu Gln Ile Asp Thr Leu Lys
115 120 125

Asn Asp Pro Val Gly Gly Ile Leu Ala Leu Ser Ser His Asp Thr Leu
130 135 140

Lys His Val Asp Gly Asp Thr Ile Thr Ala Thr Ile Asp Arg Lys His
145 150 155 160

Val Trp Arg Ala Leu Thr Pro Gln Met Phe Lys Tyr Gly Met Leu Arg
165 170 175

Asp Ala Leu Gln Arg Thr Glu Gly Asn Pro Ala Val Thr Asp Glu Ala
180 185 190

Ser Ala Leu Glu Leu Leu Gly His Lys Pro Lys Ile Val Glu Gly Arg
195 200 205

Pro Asp Asn Ile Lys Ile Thr Arg Pro Glu Asp Leu Ala Leu Ala Gln
210 215 220

Phe Tyr Met Glu Gln Gln Ala
225 230



<210> 7 
<211> 855 
<212> DNA 

<213> Methylomonas 16a 
<220> 

<223> ORF4
<400> 7 

atggattatg cggctgggtg gggcgaaaga tggcctgctc cggcaaaatt gaacttaatg 60 
ttgaggatta ccggtcgcag gccagatggc tatcatctgt tgcaaacggt gtttcaaatg 120 
ctcgatctat gcgattggtt gacgtttcat ccggttgatg atggccgcgt gacgctgcga 180 
aatccaatct ccggcgttcc agagcaggat gacttgactg ttcgggcggc taatttgttg 240 
aagtctcata ccggctgtgt gcgcggagtt tgtatcgata tcgagaaaaa tctgcctatg 300
ggtggtggtt tgggtggtgg aagttccgat gctgctacaa ccttggtagt tctaaatcgg 360 
ctttggggct tgggcttgtc gaagcgtgag ttgatggatt tgggcttgag gcttggtgcc 420 
gatgtgcctg tgtttgtgtt tggttgttcg gcctggggcg aaggtgtgag cgaggatttg 480 
caggcaataa cgttgccgga acaatggttt gtcatcatta aaccggattg ccatgtgaat 540
actggagaaa ttttttctgc agaaaatttg acaaggaata gtgcagtcgt tacaatgagc 600 
gactttcttg caggggataa tcggaatgat tgttcggaag tggtttgcaa gttatatcga 660 
ccggtgaaag atgcaatcga tgcgttgtta tgctatgcgg aagcgagatt gacggggacc 720 
ggtgcatgtg tgttcgctca gttttgtaac aaggaagatg ctgagagtgc gttagaagga 780 
ttgaaagatc ggtggctggt gttcttggct aaaggcttga atcagtctgc gctctacaag 840 
aaattagaac aggga 855 



<210> 8 
<211> 285
<212> PRT

<213> Methylomonas 16a 



<220> 

<223> Amino acid sequences encoded by ORF4 
<400> 8 

Met Asp Tyr Ala Ala Gly Trp Gly Glu Arg Trp Pro Ala Pro Ala Lys
1 5 10 15

Leu Asn Leu Met Leu Arg Ile Thr Gly Arg Arg Pro Asp Gly Tyr His
20 25 30

Leu Leu Gln Thr Val Phe Gln Met Leu Asp Leu Cys Asp Trp Leu Thr
35 40 45

Phe His Pro Val Asp Asp Gly Arg Val Thr Leu Arg Asn Pro Ile Ser
50 55 60

Gly Val Pro Glu Gln Asp Asp Leu Thr Val Arg Ala Ala Asn Leu Leu
65 70 75 80

Lys Ser His Thr Gly Cys Val Arg Gly Val Cys Ile Asp Ile Glu Lys
85 90 95

Asn Leu Pro Met Gly Gly Gly Leu Gly Gly Gly Ser Ser Asp Ala Ala
100 105 110

Thr Thr Leu Val Val Leu Asn Arg Leu Trp Gly Leu Gly Leu Ser Lys
115 120 125

Arg Glu Leu Met Asp Leu Gly Leu Arg Leu Gly Ala Asp Val Pro Val
130 135 140

Phe Val Phe Gly Cys Ser Ala Trp Gly Glu Gly Val Ser Glu Asp Leu
145 150 155 160

Gln Ala Ile Thr Leu Pro Glu Gln Trp Phe Val Ile Ile Lys Pro Asp
165 170 175

Cys His Val Asn Thr Gly Glu Ile Phe Ser Ala Glu Asn Leu Thr Arg
180 185 190

Asn Ser Ala Val Val Thr Met Ser Asp Phe Leu Ala Gly Asp Asn Arg
195 200 205

Asn Asp Cys Ser Glu Val Val Cys Lys Leu Tyr Arg Pro Val Lys Asp
210 215 220

Ala Ile Asp Ala Leu Leu Cys Tyr Ala Glu Ala Arg Leu Thr Gly Thr
225 230 235 240

Gly Ala Cys Val Phe Ala Gln Phe Cys Asn Lys Glu Asp Ala Glu Ser
245 250 255

Ala Leu Glu Gly Leu Lys Asp Arg Trp Leu Val Phe Leu Ala Lys Gly
260 265 270

Leu Asn Gln Ser Ala Leu Tyr Lys Lys Leu Glu Gln Gly
275 280 285



<210> 9
<211> 471 
<212> DNA 

<213> Methylomonas 16a 
<220> 

<223> ORF5
<400> 9 

atgatacgcg taggcatggg ttacgacgtg caccgtttca acgacggcga ccacatcatt 60
ttgggcggcg tcaaaatccc ttatgaaaaa ggcctggaag cccattccga cggcgacgtg 120
gtgctgcacg cattggccga cgccatcttg ggagccgccg ctttgggcga catcggcaaa 180
catttcccgg acaccgaccc caatttcaag ggcgccgaca gcagggtgct actgcgccac 240
gtgtacggca tcgtcaagga aaaaggctat aaactggtca acgccgacgt gaccatcatc 300
gctcaggcgc cgaagatgct gccacacgtg cccggcatgc gcgccaacat tgccgccgat 360
ctggaaaccg atgtcgattt cattaatgta aaagccacga cgaccgagaa actgggcttt 420
gagggccgta aggaaggcat cgccgtgcag gctgtggtgt tgatagaacg c 471



<210> 10 
<211> 157 
<212> PRT 

<213> Methylomonas 16a 
<220> 

<223> Amino acid sequences encoded by ORF5 
<400> 10 

Met Ile Arg Val Gly Met Gly Tyr Asp Val His Arg Phe Asn Asp Gly
1 5 10 15

Asp His Ile Ile Leu Gly Gly Val Lys Ile Pro Tyr Glu Lys Gly Leu
20 25 30

Glu Ala His Ser Asp Gly Asp Val Val Leu His Ala Leu Ala Asp Ala
35 40 45

Ile Leu Gly Ala Ala Ala Leu Gly Asp Ile Gly Lys His Phe Pro Asp
50 55 60

Thr Asp Pro Asn Phe Lys Gly Ala Asp Ser Arg Val Leu Leu Arg His
65 70 75 80

Val Tyr Gly Ile Val Lys Glu Lys Gly Tyr Lys Leu Val Asn Ala Asp
85 90 95

Val Thr Ile Ile Ala Gln Ala Pro Lys Met Leu Pro His Val Pro Gly
100 105 110

Met Arg Ala Asn Ile Ala Ala Asp Leu Glu Thr Asp Val Asp Phe Ile
115 120 125

Asn Val Lys Ala Thr Thr Thr Glu Lys Leu Gly Phe Glu Gly Arg Lys
130 135 140

Glu Gly Ile Ala Val Gln Ala Val Val Leu Ile Glu Arg
145 150 155



<210> 11
<211> 1632 
<212> DNA 

<213> Methylomonas 16a 
<220> 

<223> ORF6 
<400> 11 

atgacaaaat tcatctttat caccggcggc gtggtgtcat ccttgggaaa agggatagcc 60 
gcctcctccc tggcggcgat tctggaagac cgcggcctca aagtcactat cacaaaactc 120 
gatccctaca tcaacgtcga ccccggcacc atgagcccgt ttcaacacgg cgaggtgttc 180 
gtgaccgaag acggtgccga aaccgatttg gaccttggcc attacgaacg gtttttgaaa 240 
accacgatga ccaagaaaaa caacttcacc accggtcagg tttacgagca ggtattacgc 300 
aacgagcgca aaggtgatta tcttggcgcg accgtgcaag tcattccaca tatcaccgac 360 
gaaatcaaac gccgggtgta tgaaagcgcc gaagggaaag atgtggcatt gatcgaagtc 420 
ggcggcacgg tgggcgacat cgaatcgtta ccgtttctgg aaaccatacg ccagatgggc 480
gtggaactgg gtcgtgaccg cgccttgttc attcatttga cgctggtgcc ttacatcaaa 540 
tcggccggcg aactgaaaac caagcccacc cagcattcgg tcaaagaact gcgcaccatc 600 
gggattcagc cggacatttt gatctgtcgt tcagaacaac cgatcccggc cagtgaacgc 660 
cgcaagatcg cgctatttac caatgtcgcc gaaaaggcgg tgatttccgc gatcgatgcc 720
gacaccattt accgcattcc gctattgctg cgcgaacaag gcctggacga cctggtggtc 780 
gatcagttgc gcctggacgt accagcggcg gatttatcgg cctgggaaaa ggtcgtcgat 840
ggcctgactc atccgaccga cgaagtcagc attgcgatcg tcggtaaata tgtcgaccac 900
accgatgcct acaaatcgct gaatgaagcc ctgattcatg ccggcattca cacgcgccac 960
aaggtgcaaa tcagctacat cgactccgaa accatagaag ccgaaggcac cgccaaattg 1020
aaaaacgtcg atgcgatcct ggtgccgggt ggtttcggcg aacgcggcgt ggaaggcaag 1080
atttctaccg tgcgttttgc ccgcgagaac aaaatcccgt atttgggcat ttgcttgggc 1140
atgcaatcgg cggtaatcga attcgcccgc aacgtggttg gcctggaagg cgcgcacagc 1200
accgaattcc tgccgaaatc gccacaccct gtgatcggct tgatcaccga atggatggac 1260
gaagccggcg aactggtcac acgcgacgaa gattccgatc tgggcggcac gatgcgtctg 1320
ggcgcgcaaa aatgccgcct gaaggctgat tccttggctt ttcagttgta tcaaaaagac 1380 
gtcatcaccg agcgtcaccg ccaccgctac gaattcaaca atcaatattt aaaacaactg 1440 
gaagcggccg gcatgaaatt ttccggtaaa tcgctggacg gccgcctggt ggagatcatc 1500 
gagctacccg aacacccctg gttcctggcc tgccagttcc atcccgaatt cacctcgacg 1560 
ccgcgtaacg gccacgccct attttcgggc ttcgtcgaag cggccgccaa acacaaaaca 1620
caaggcacag ca 1632 



<210> 12 
<211> 544 
<212> PRT 

<213> Methylomonas 16a 
<220> 

<223> Amino acid sequences encoded by ORF6
<400> 12 

Met Thr Lys Phe Ile Phe Ile Thr Gly Gly Val Val Ser Ser Leu Gly
1 5 10 15

Lys Gly Ile Ala Ala Ser Ser Leu Ala Ala Ile Leu Glu Asp Arg Gly
20 25 30

Leu Lys Val Thr Ile Thr Lys Leu Asp Pro Tyr Ile Asn Val Asp Pro
35 40 45

Gly Thr Met Ser Pro Phe Gln His Gly Glu Val Phe Val Thr Glu Asp
50 55 60

Gly Ala Glu Thr Asp Leu Asp Leu Gly His Tyr Glu Arg Phe Leu Lys
65 70 75 80

Thr Thr Met Thr Lys Lys Asn Asn Phe Thr Thr Gly Gln Val Tyr Glu
85 90 95

Gln Val Leu Arg Asn Glu Arg Lys Gly Asp Tyr Leu Gly Ala Thr Val
100 105 110

Gln Val Ile Pro His Ile Thr Asp Glu Ile Lys Arg Arg Val Tyr Glu
115 120 125

Ser Ala Glu Gly Lys Asp Val Ala Leu Ile Glu Val Gly Gly Thr Val
130 135 140

Gly Asp Ile Glu Ser Leu Pro Phe Leu Glu Thr Ile Arg Gln Met Gly
145 150 155 160

Val Glu Leu Gly Arg Asp Arg Ala Leu Phe Ile His Leu Thr Leu Val
165 170 175

Pro Tyr Ile Lys Ser Ala Gly Glu Leu Lys Thr Lys Pro Thr Gln His
180 185 190

Ser Val Lys Glu Leu Arg Thr Ile Gly Ile Gln Pro Asp Ile Leu Ile
195 200 205

Cys Arg Ser Glu Gln Pro Ile Pro Ala Ser Glu Arg Arg Lys Ile Ala
210 215 220

Leu Phe Thr Asn Val Ala Glu Lys Ala Val Ile Ser Ala Ile Asp Ala
225 230 235 240

Asp Thr Ile Tyr Arg Ile Pro Leu Leu Leu Arg Glu Gln Gly Leu Asp
245 250 255

Asp Leu Val Val Asp Gln Leu Arg Leu Asp Val Pro Ala Ala Asp Leu
260 265 270

Ser Ala Trp Glu Lys Val Val Asp Gly Leu Thr His Pro Thr Asp Glu
275 280 285

Val Ser Ile Ala Ile Val Gly Lys Tyr Val Asp His Thr Asp Ala Tyr
290 295 300

Lys Ser Leu Asn Glu Ala Leu Ile His Ala Gly Ile His Thr Arg His
305 310 315 320

Lys Val Gln Ile Ser Tyr Ile Asp Ser Glu Thr Ile Glu Ala Glu Gly
325 330 335

Thr Ala Lys Leu Lys Asn Val Asp Ala Ile Leu Val Pro Gly Gly Phe
340 345 350

Gly Glu Arg Gly Val Glu Gly Lys Ile Ser Thr Val Arg Phe Ala Arg
355 360 365

Glu Asn Lys Ile Pro Tyr Leu Gly Ile Cys Leu Gly Met Gln Ser Ala
370 375 380

Val Ile Glu Phe Ala Arg Asn Val Val Gly Leu Glu Gly Ala His Ser
385 390 395 400

Thr Glu Phe Leu Pro Lys Ser Pro His Pro Val Ile Gly Leu Ile Thr
405 410 415

Glu Trp Met Asp Glu Ala Gly Glu Leu Val Thr Arg Asp Glu Asp Ser
420 425 430

Asp Leu Gly Gly Thr Met Arg Leu Gly Ala Gln Lys Cys Arg Leu Lys
435 440 445

Ala Asp Ser Leu Ala Phe Gln Leu Tyr Gln Lys Asp Val Ile Thr Glu
450 455 460

Arg His Arg His Arg Tyr Glu Phe Asn Asn Gln Tyr Leu Lys Gln Leu
465 470 475 480

Glu Ala Ala Gly Met Lys Phe Ser Gly Lys Ser Leu Asp Gly Arg Leu
485 490 495

Val Glu Ile Ile Glu Leu Pro Glu His Pro Trp Phe Leu Ala Cys Gln
500 505 510

Phe His Pro Glu Phe Thr Ser Thr Pro Arg Asn Gly His Ala Leu Phe
515 520 525

Ser Gly Phe Val Glu Ala Ala Ala Lys His Lys Thr Gln Gly Thr Ala
530 535 540



<210> 13 
<211> 891 
<212> DNA 

<213> Methylomonas 16a 
<220> 

<223> ORF7
<400> 13 

atgagtaaat tgaaagccta cctgaccgtc tgccaagaac gcgtcgagcg cgcgctggac 60 
gcccgtctgc ctgccgaaaa catactgcca caaaccttgc atcaggccat gcgctattcc 120 
gtattgaacg gcggcaaacg cacccggccc ttgttgactt atgcgaccgg tcaggctttg 180 
ggcttgccgg aaaacgtgct ggatgcgccg gcttgcgcgg tagaattcat ccatgtgtat 240 
tcgctgattc acgacgatct gccggccatg gacaacgatg atctgcgccg cggcaaaccg 300
acctgtcaca aggcttacga cgaggccacc gccattttgg ccggcgacgc actgcaggcg 360 
ctggcctttg aagttctggc caacgacccc ggcatcaccg tcgatgcccc ggctcgcctg 420 
aaaatgatca cggctttgac ccgcgccagc ggctctcaag gcatggtggg cggtcaagcc 480
atcgatctcg gctccgtcgg ccgcaaattg acgctgccgg aactcgaaaa catgcatatc 540 
cacaagactg gcgccctgat ccgcgccagc gtcaatctgg cggcattatc caaacccgat 600 
ctggatactt gcgtcgccaa gaaactggat cactatgcca aatgcatagg cttgtcgttc 660 
caggtcaaag acgacattct cgacatcgaa gccgacaccg cgacactcgg caagactcag 720 
ggcaaggaca tcgataacga caaaccgacc taccctgcgc tattgggcat ggctggcgcc 780 
aaacaaaaag cccaggaatt gcacgaacaa gcagtcgaaa gcttaacggg atttggcagc 840 
gaagccgacc tgctgcgcga actatcgctt tacatcatcg agcgcacgca c 891 



<210> 14 
<211> 297 






<212> PRT 

<213> Methylomonas 16a 



<220> 

<223> Amino acid sequences encoded by ORF7 
<400> 14 

Met Ser Lys Leu Lys Ala Tyr Leu Thr Val Cys Gln Glu Arg Val Glu
1 5 10 15

Arg Ala Leu Asp Ala Arg Leu Pro Ala Glu Asn Ile Leu Pro Gln Thr
20 25 30

Leu His Gln Ala Met Arg Tyr Ser Val Leu Asn Gly Gly Lys Arg Thr
35 40 45

Arg Pro Leu Leu Thr Tyr Ala Thr Gly Gln Ala Leu Gly Leu Pro Glu
50 55 60

Asn Val Leu Asp Ala Pro Ala Cys Ala Val Glu Phe Ile His Val Tyr
65 70 75 80

Ser Leu Ile His Asp Asp Leu Pro Ala Met Asp Asn Asp Asp Leu Arg
85 90 95

Arg Gly Lys Pro Thr Cys His Lys Ala Tyr Asp Glu Ala Thr Ala Ile
100 105 110

Leu Ala Gly Asp Ala Leu Gln Ala Leu Ala Phe Glu Val Leu Ala Asn
115 120 125

Asp Pro Gly Ile Thr Val Asp Ala Pro Ala Arg Leu Lys Met Ile Thr
130 135 140

Ala Leu Thr Arg Ala Ser Gly Ser Gln Gly Met Val Gly Gly Gln Ala
145 150 155 160

Ile Asp Leu Gly Ser Val Gly Arg Lys Leu Thr Leu Pro Glu Leu Glu
165 170 175

Asn Met His Ile His Lys Thr Gly Ala Leu Ile Arg Ala Ser Val Asn
180 185 190

Leu Ala Ala Leu Ser Lys Pro Asp Leu Asp Thr Cys Val Ala Lys Lys
195 200 205

Leu Asp His Tyr Ala Lys Cys Ile Gly Leu Ser Phe Gln Val Lys Asp
210 215 220

Asp Ile Leu Asp Ile Glu Ala Asp Thr Ala Thr Leu Gly Lys Thr Gln
225 230 235 240

Gly Lys Asp Ile Asp Asn Asp Lys Pro Thr Tyr Pro Ala Leu Leu Gly
245 250 255

Met Ala Gly Ala Lys Gln Lys Ala Gln Glu Leu His Glu Gln Ala Val
260 265 270

Glu Ser Leu Thr Gly Phe Gly Ser Glu Ala Asp Leu Leu Arg Glu Leu
275 280 285

Ser Leu Tyr Ile Ile Glu Arg Thr His
290 295



<210> 15 
<211> 1533 
<212> DNA 

<213> Methylomonas 16a 
<220> 

<223> ORF8 



<400> 15 

atggccaaca ccaaacacat catcatcgtc ggcgcgggtc ccggcggact ttgcgccggc 60
atgttgctga gccagcgcgg cttcaaggta tcgattttcg acaaacatgc agaaatcggc 120
ggccgcaacc gcccgatcaa catgaacggc tttaccttcg ataccggtcc gacattcttg 180
ttgatgaaag gcgtgctgga cgaaatgttc gaactgtgcg agcgccgtag cgaggattat 240
ctggaattcc tgccgctaag cccgatgtac cgcctgctgt acgacgaccg cgacatcttc 300
gtctattccg accgcgagaa catgcgcgcc gaattgcaac gggtattcga cgaaggcacg 360
gacggctacg aacagttcat ggaacaggaa cgcaaacgct tcaacgcgct gtatccctgc 420
atcacccgcg attattccag cctgaaatcc tttttgtcgc tggacttgat caaggccctg 480
ccgtggctgg cttttccgaa aagcgtgttc aataatctcg gccagtattt caaccaggaa 540
aaaatgcgcc tggccttttg ctttcagtcc aagtatctgg gcatgtcgcc gtgggaatgc 600
ccggcactgt ttacgatgct gccctatctg gagcacgaat acggcattta tcacgtcaaa 660
ggcggcctga accgcatcgc ggcggcgatg gcgcaagtga tcgcggaaaa cggcggcgaa 720
attcacttga acagcgaaat cgagtcgctg atcatcgaaa acggcgctgc caagggcgtc 780
aaattacaac atggcgcgga gctgcgcggc gacgaagtca tcatcaacgc ggattttgcc 840
cacgcgatga cgcatctggt caaaccgggc gtcttgaaaa aatacacccc ggaaaacctg 900
aagcagcgcg agtattcctg ttcgaccttc atgctgtatc tgggtttgga caagatttac 960
gatctgccgc accataccat cgtgtttgcc aaggattaca ccaccaatat ccgcaacatt 1020
ttcgacaaca aaaccctgac ggacgatttt tcgttttacg tgcaaaacgc cagcgccagc 1080
gacgacagcc tagcgccagc cggcaaatcg gcgctgtacg tgctggtgcc gatgcccaac 1140
aacgacagcg gcctggactg gcaggcgcat tgccaaaacg tgcgcgaaca ggtgttggac 1200
acgctgggcg cgcgactggg attgagcgac atcagagccc atatcgaatg cgaaaaaatc 1260
atcacgccgc aaacctggga aacggacgaa cacgtttaca agggcgccac tttcagtttg 1320
tcgcacaagt tcagccaaat gctgtactgg cggccgcaca accgtttcga ggaactggcc 1380
aattgctatc tggtcggcgg cggcacgcat cccggtagcg gtttgccgac catctacgaa 1440
tcggcgcgga tttcggccaa gctgatttcc cagaaacatc gggtgaggtt caaggacata 1500
gcacacagcg cctggctgaa aaaagccaaa gcc 1533



<210> 16 
<211> 511 
<212> PRT 

<213> Methylomonas 16a 
<220> 

<223> Amino acid sequences encoded by ORF8
<400> 16 

Met Ala Asn Thr Lys His Ile Ile Ile Val Gly Ala Gly Pro Gly Gly
1 5 10 15

Leu Cys Ala Gly Met Leu Leu Ser Gln Arg Gly Phe Lys Val Ser Ile
20 25 30

Phe Asp Lys His Ala Glu Ile Gly Gly Arg Asn Arg Pro Ile Asn Met
35 40 45

Asn Gly Phe Thr Phe Asp Thr Gly Pro Thr Phe Leu Leu Met Lys Gly
50 55 60

Val Leu Asp Glu Met Phe Glu Leu Cys Glu Arg Arg Ser Glu Asp Tyr
65 70 75 80

Leu Glu Phe Leu Pro Leu Ser Pro Met Tyr Arg Leu Leu Tyr Asp Asp
85 90 95

Arg Asp Ile Phe Val Tyr Ser Asp Arg Glu Asn Met Arg Ala Glu Leu
100 105 110

Gln Arg Val Phe Asp Glu Gly Thr Asp Gly Tyr Glu Gln Phe Met Glu
115 120 125

Gln Glu Arg Lys Arg Phe Asn Ala Leu Tyr Pro Cys Ile Thr Arg Asp
130 135 140

Tyr Ser Ser Leu Lys Ser Phe Leu Ser Leu Asp Leu Ile Lys Ala Leu
145 150 155 160

Pro Trp Leu Ala Phe Pro Lys Ser Val Phe Asn Asn Leu Gly Gln Tyr
165 170 175

Phe Asn Gln Glu Lys Met Arg Leu Ala Phe Cys Phe Gln Ser Lys Tyr
180 185 190

Leu Gly Met Ser Pro Trp Glu Cys Pro Ala Leu Phe Thr Met Leu Pro
195 200 205

Tyr Leu Glu His Glu Tyr Gly Ile Tyr His Val Lys Gly Gly Leu Asn
210 215 220

Arg Ile Ala Ala Ala Met Ala Gln Val Ile Ala Glu Asn Gly Gly Glu
225 230 235 240

Ile His Leu Asn Ser Glu Ile Glu Ser Leu Ile Ile Glu Asn Gly Ala
245 250 255

Ala Lys Gly Val Lys Leu Gln His Gly Ala Glu Leu Arg Gly Asp Glu
260 265 270

Val Ile Ile Asn Ala Asp Phe Ala His Ala Met Thr His Leu Val Lys
275 280 285

Pro Gly Val Leu Lys Lys Tyr Thr Pro Glu Asn Leu Lys Gln Arg Glu
290 295 300

Tyr Ser Cys Ser Thr Phe Met Leu Tyr Leu Gly Leu Asp Lys Ile Tyr
305 310 315 320

Asp Leu Pro His His Thr Ile Val Phe Ala Lys Asp Tyr Thr Thr Asn
325 330 335

Ile Arg Asn Ile Phe Asp Asn Lys Thr Leu Thr Asp Asp Phe Ser Phe
340 345 350

Tyr Val Gln Asn Ala Ser Ala Ser Asp Asp Ser Leu Ala Pro Ala Gly
355 360 365

Lys Ser Ala Leu Tyr Val Leu Val Pro Met Pro Asn Asn Asp Ser Gly
370 375 380

Leu Asp Trp Gln Ala His Cys Gln Asn Val Arg Glu Gln Val Leu Asp
385 390 395 400

Thr Leu Gly Ala Arg Leu Gly Leu Ser Asp Ile Arg Ala His Ile Glu
405 410 415

Cys Glu Lys Ile Ile Thr Pro Gln Thr Trp Glu Thr Asp Glu His Val
420 425 430

Tyr Lys Gly Ala Thr Phe Ser Leu Ser His Lys Phe Ser Gln Met Leu
435 440 445

Tyr Trp Arg Pro His Asn Arg Phe Glu Glu Leu Ala Asn Cys Tyr Leu
450 455 460

Val Gly Gly Gly Thr His Pro Gly Ser Gly Leu Pro Thr Ile Tyr Glu
465 470 475 480

Ser Ala Arg Ile Ser Ala Lys Leu Ile Ser Gln Lys His Arg Val Arg
485 490 495

Phe Lys Asp Ile Ala His Ser Ala Trp Leu Lys Lys Ala Lys Ala
500 505 510



<210> 17 
<211> 1491 
<212> DNA 

<213> Methylomonas 16a 
<220> 

<223> ORF9 



<400> 17 

atgaactcaa atgacaacca acgcgtgatc gtgatcggcg ccggcctcgg cggcctgtcc 60
gccgctattt cgctggccac ggccggcttt tccgtgcaac tcatcgaaaa aaacgacaag 120
gtcggcggca agctcaacat catgaccaaa gacggcttta ccttcgatct ggggccgtcc 180
attttgacga tgccgcacat ctttgaggcc ttgttcacag gggccggcaa aaacatggcc 240
gattacgtgc aaatccagaa agtcgaaccg cactggcgca atttcttcga ggacggtagc 300
gtgatcgact tgtgcgaaga cgccgaaacc cagcgccgcg agctggataa acttggcccc 360
ggcacttacg cgcaattcca gcgctttctg gactattcga aaaacctctg cacggaaacc 420
gaagccggtt acttcgccaa gggcctggac ggcttttggg atttactcaa gttttacggc 480
ccgctccgca gcctgctgag tttcgacgtc ttccgcagca tggaccaggg cgtgcgccgc 540
tttatttccg atcccaagtt ggtcgaaatc ctgaattact tcatcaaata cgtcggctcc 600
tcgccttacg atgcgcccgc cttgatgaac ctgctgcctt acattcaata tcattacggc 660
ctgtggtacg tgaaaggcgg catgtatggc atggcgcagg ccatggaaaa actggccgtg 720
gaattgggcg tcgagattcg tttagatgcc gaggtgtcgg aaatccaaaa acaggacggc 780
agagcctgcg ccgtaaagtt ggcgaacggc gacgtgctgc cggccgacat cgtggtgtcg 840
aacatggaag tgattccggc gatggaaaaa ctgctgcgca gcccggccag cgaactgaaa 900
aaaatgcagc gcttcgagcc tagctgttcc ggcctggtgc tgcacttggg cgtggacagg 960
ctgtatccgc aactggcgca ccacaatttc ttttattccg atcatccgcg cgaacatttc 1020
gatgcggtat tcaaaagcca tcgcctgtcg gacgatccga ccatttatct ggtcgcgccg 1080
tgcaagaccg accccgccca ggcgccggcc ggctgcgaga tcatcaaaat cctgccccat 1140
atcccgcacc tcgaccccga caaactgctg accgccgagg attattcagc cttgcgcgag 1200
cgggtgctgg tcaaactcga acgcatgggc ctgacggatt tacgccaaca catcgtgacc 1260
gaagaatact ggacgccgct ggatattcag gccaaatatt attcaaacca gggctcgatt 1320
tacggcgtgg tcgccgaccg cttcaaaaac ctgggtttca aggcacctca acgcagcagc 1380



gaattatcca atctgtattt cgtcggcggc agcgtcaatc ccggcggcgg catgccgatg 1440 
gtgacgctgt ccgggcaatt ggtgagggac aagattgtgg cggatttgca a 1491 



<210> 18 
<211> 497 
<212> PRT 

<213> Methylomonas 16a 
<220> 

<223> Amino acid sequences encoded by ORF9 
<400> 18 

Met Asn Ser Asn Asp Asn Gln Arg Val Ile Val Ile Gly Ala Gly Leu 
1 5 10 15 

Gly Gly Leu Ser Ala Ala Ile Ser Leu Ala Thr Ala Gly Phe Ser Val 
20 25 30 

Gln Leu Ile Glu Lys Asn Asp Lys Val Gly Gly Lys Leu Asn Ile Met 
35 40 45 

Thr Lys Asp Gly Phe Thr Phe Asp Leu Gly Pro Ser Ile Leu Thr Met 
50 55 60 

Pro His Ile Phe Glu Ala Leu Phe Thr Gly Ala Gly Lys Asn Met Ala 
65 70 75 80 

Asp Tyr Val Gln Ile Gln Lys Val Glu Pro His Trp Arg Asn Phe Phe 
85 90 95 



Glu Asp Gly Ser Val Ile Asp Leu Cys Glu Asp Ala Glu Thr Gln Arg 
100 105 110 

Arg Glu Leu Asp Lys Leu Gly Pro Gly Thr Tyr Ala Gln Phe Gln Arg 
115 120 125 

Phe Leu Asp Tyr Ser Lys Asn Leu Cys Thr Glu Thr Glu Ala Gly Tyr 
130 135 140 

Phe Ala Lys Gly Leu Asp Gly Phe Trp Asp Leu Leu Lys Phe Tyr Gly 
145 150 155 160 

Pro Leu Arg Ser Leu Leu Ser Phe Asp Val Phe Arg Ser Met Asp Gln 
165 170 175 

Gly Val Arg Arg Phe Ile Ser Asp Pro Lys Leu Val Glu Ile Leu Asn 
180 185 190 

Tyr Phe Ile Lys Tyr Val Gly Ser Ser Pro Tyr Asp Ala Pro Ala Leu 
195 200 205 

Met Asn Leu Leu Pro Tyr Ile Gln Tyr His Tyr Gly Leu Trp Tyr Val 
210 215 220 

Lys Gly Gly Met Tyr Gly Met Ala Gln Ala Met Glu Lys Leu Ala Val 
225 230 235 240 

Glu Leu Gly Val Glu Ile Arg Leu Asp Ala Glu Val Ser Glu Ile Gln 
245 250 255 

Lys Gln Asp Gly Arg Ala Cys Ala Val Lys Leu Ala Asn Gly Asp Val 
260 265 270 



Leu Pro Ala Asp Ile Val Val Ser Asn Met Glu Val Ile Pro Ala Met 
275 280 285 

Glu Lys Leu Leu Arg Ser Pro Ala Ser Glu Leu Lys Lys Met Gln Arg 
290 295 300 

Phe Glu Pro Ser Cys Ser Gly Leu Val Leu His Leu Gly Val Asp Arg 
305 310 315 320 

Leu Tyr Pro Gln Leu Ala His His Asn Phe Phe Tyr Ser Asp His Pro 
325 330 335 

Arg Glu His Phe Asp Ala Val Phe Lys Ser His Arg Leu Ser Asp Asp 
340 345 350 

Pro Thr Ile Tyr Leu Val Ala Pro Cys Lys Thr Asp Pro Ala Gln Ala 
355 360 365 

Pro Ala Gly Cys Glu Ile Ile Lys Ile Leu Pro His Ile Pro His Leu 
370 375 380 

Asp Pro Asp Lys Leu Leu Thr Ala Glu Asp Tyr Ser Ala Leu Arg Glu 
385 390 395 400 

Arg Val Leu Val Lys Leu Glu Arg Met Gly Leu Thr Asp Leu Arg Gln 
405 410 415 

His Ile Val Thr Glu Glu Tyr Trp Thr Pro Leu Asp Ile Gln Ala Lys 
420 425 430 

Tyr Tyr Ser Asn Gln Gly Ser Ile Tyr Gly Val Val Ala Asp Arg Phe 
435 440 445 

Lys Asn Leu Gly Phe Lys Ala Pro Gln Arg Ser Ser Glu Leu Ser Asn 
450 455 460 

Leu Tyr Phe Val Gly Gly Ser Val Asn Pro Gly Gly Gly Met Pro Met 
465 470 475 480 

Val Thr Leu Ser Gly Gln Leu Val Arg Asp Lys Ile Val Ala Asp Leu 
485 490 495 

Gln 



<210> 19 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : primer 



<400> 19 

aaggatccgc gtattcgtac tc 22 



<210> 20 
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : primer 
<400> 20 

ctggatccga tctagaaata ggctcgagtt gtcgttcagg 40 



<210> 21 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : primer 
<400> 21 

aaggatccta ctcgagctga catcagtgct 30 



<210> 22 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : primer 
<400> 22 

gctctagatg caaccagaat cg 22 



<210> 23 
<211> 954 
<212> DNA 

<213> Methylomonas 16a 



<400> 23 

atgcaaatcg tactcgcaaa cccccgtgga ttctgtgccg gcgtggaccg ggccattgaa 60 
attgtcgatc aagccatcga agcctttggt gcgccgattt atgtgcggca cgaggtggtg 120 
cataaccgca ccgtggtcga tggactgaaa caaaaaggtg cggtgttcat cgaggaacta 180 
agcgatgtgc cggtgggttc ctacttgatt ttcagcgcgc acggcgtatc caaggaggtg 240 
caacaggaag ccgaggagcg ccagttgacg gtattcgatg cgacttgtcc gctggtgacc 300 
aaagtgcaca tgcaggttgc caagcatgcc aaacagggcc gagaagtgat tttgatcggc 360 
cacgccggtc atccggaagt ggaaggcacg atgggccagt atgaaaaatg caccgaaggc 420 
ggcggcattt atctggtcga aactccggaa gacgtacgca atttgaaagt caacaatccc 480 
aatgatctgg cctatgtgac gcagacgacc ttgtcgatga ccgacaccaa ggtcatggtg 540 
gatgcgttac gcgaacaatt tccgtccatt aaggagcaaa aaaaggacga tatttgttac 600 
gcgacgcaaa accgtcagga tgcggtgcat gatctggcca agatttccga cctgattctg 660 
gttgtcggct ctcccaatag ttcgaattcc aaccgtttgc gtgaaatcgc cgtgcaactc 720 
ggtaaacccg cttatttgat cgatacttac caggatttga agcaagattg gctggaggga 780 
attgaagtag tcggggttac cgcgggcgct tcggcgccgg aagtgttggt gcaggaagtg 840 
atcgatcaac tgaaggcatg gggcggcgaa accacttcgg tcagagaaaa cagcggcatc 900 
gaggaaaagg tagtcttttc gattcccaag gagttgaaaa aacatatgca agcg 954 



<210> 24 
<211> 318 
<212> PRT 

<213> Methylomonas 16a 
<400> 24 

Met Gln Ile Val Leu Ala Asn Pro Arg Gly Phe Cys Ala Gly Val Asp 
1 5 10 15 

Arg Ala Ile Glu Ile Val Asp Gln Ala Ile Glu Ala Phe Gly Ala Pro 
20 25 30 

Ile Tyr Val Arg His Glu Val Val His Asn Arg Thr Val Val Asp Gly 
35 40 45 

Leu Lys Gln Lys Gly Ala Val Phe Ile Glu Glu Leu Ser Asp Val Pro 
50 55 60 

Val Gly Ser Tyr Leu Ile Phe Ser Ala His Gly Val Ser Lys Glu Val 
65 70 75 80 

Gln Gln Glu Ala Glu Glu Arg Gln Leu Thr Val Phe Asp Ala Thr Cys 
85 90 95 

Pro Leu Val Thr Lys Val His Met Gln Val Ala Lys His Ala Lys Gln 
100 105 110 

Gly Arg Glu Val Ile Leu Ile Gly His Ala Gly His Pro Glu Val Glu 
115 120 125 

Gly Thr Met Gly Gln Tyr Glu Lys Cys Thr Glu Gly Gly Gly Ile Tyr 
130 135 140 

Leu Val Glu Thr Pro Glu Asp Val Arg Asn Leu Lys Val Asn Asn Pro 
145 150 155 160 

Asn Asp Leu Ala Tyr Val Thr Gln Thr Thr Leu Ser Met Thr Asp Thr 
165 170 175 

Lys Val Met Val Asp Ala Leu Arg Glu Gln Phe Pro Ser Ile Lys Glu 
180 185 190 

Gln Lys Lys Asp Asp Ile Cys Tyr Ala Thr Gln Asn Arg Gln Asp Ala 
195 200 205 

Val His Asp Leu Ala Lys Ile Ser Asp Leu Ile Leu Val Val Gly Ser 
210 215 220 

Pro Asn Ser Ser Asn Ser Asn Arg Leu Arg Glu Ile Ala Val Gln Leu 
225 230 235 240 

Gly Lys Pro Ala Tyr Leu Ile Asp Thr Tyr Gln Asp Leu Lys Gln Asp 
245 250 255 

Trp Leu Glu Gly Ile Glu Val Val Gly Val Thr Ala Gly Ala Ser Ala 
260 265 270 

Pro Glu Val Leu Val Gln Glu Val Ile Asp Gln Leu Lys Ala Trp Gly 
275 280 285 



Gly Glu Thr Thr Ser Val Arg Glu Asn Ser Gly Ile Glu Glu Lys Val 
290 295 300 

Val Phe Ser Ile Pro Lys Glu Leu Lys Lys His Met Gln Ala 
305 310 315 